Efficient HPSG Parsing with Supertagging and CFG-Filtering

Authors

  • Takuya Matsuzaki
  • Yusuke Miyao
  • Jun'ichi Tsujii
Abstract

An efficient parsing technique for HPSG is presented. Recent research has shown that supertagging is a key technology for improving both the speed and accuracy of lexicalized grammar parsing. We show that further speed-up is possible by eliminating non-parsable lexical entry sequences from the output of the supertagger. The parsability of each lexical entry sequence is tested by a technique called CFG-filtering, in which a CFG that approximates the HPSG is used as the test. The lexical entry sequences that pass the CFG-filter are combined into parse trees by a simple shift-reduce parsing algorithm, in which structural ambiguities are resolved using a classifier and all the syntactic constraints represented in the original grammar are checked. Experimental results show that our system achieves comparable accuracy with a speed-up by a factor of six (30 msec/sentence) compared with the best published result using the same grammar.
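
The filtering step described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy grammar `BINARY_RULES`, the tag names (`DET`, `N`, `VT`, `VI`), the scores, and the function names `cfg_accepts` and `filter_sequences` are hypothetical, and CKY recognition is used here only because it is a standard way to test CFG membership; the abstract does not say how the approximating CFG is built or applied.

```python
from itertools import product

# Hypothetical toy CFG in Chomsky normal form: (left, right) -> parent symbols.
# The supertag labels themselves act as the terminal symbols.
BINARY_RULES = {
    ("DET", "N"): {"NP"},
    ("VT", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
    ("NP", "VI"): {"S"},
}


def cfg_accepts(tags, rules=BINARY_RULES, start="S"):
    """CKY recognition: True if the tag sequence derives the start symbol."""
    n = len(tags)
    if n == 0:
        return False
    # chart[(i, k)] holds every symbol that can span tags[i:k].
    chart = {(i, i + 1): {tag} for i, tag in enumerate(tags)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):
                for left, right in product(chart[(i, j)], chart[(j, k)]):
                    cell |= rules.get((left, right), set())
            chart[(i, k)] = cell
    return start in chart[(0, n)]


def filter_sequences(candidates):
    """Keep only the (tags, score) candidates that the toy CFG can parse."""
    return [(tags, score) for tags, score in candidates if cfg_accepts(tags)]


# Two made-up supertagger outputs for a five-word sentence, best score first.
candidates = [
    (["DET", "N", "VI", "DET", "N"], 0.5),  # intransitive verb: no parse
    (["DET", "N", "VT", "DET", "N"], 0.3),  # transitive verb: parsable
]
survivors = filter_sequences(candidates)
```

In this sketch the higher-scoring candidate is discarded because no rule attaches the trailing noun phrase to an intransitive verb, so only the second sequence would be handed on to the more expensive parsing stage.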

Similar resources

HPSG Supertagging: A Sequence Labeling View

Supertagging is a widely used speed-up technique for deep parsing. Supertagging has also been exploited in NLP tasks other than parsing, to make use of the rich syntactic information given by the supertags. However, the performance of the supertagger is still a bottleneck for such applications. In this paper, we investigated the relationship between supertagging and parsing, not just to...

Compiling an HPSG-based grammar into more than one CFG

Recently, the performance of HPSG parsing has been improved so that the parsers can be applied to real-world texts. CFG filtering is one of the techniques which contributed to this progress. It improved parsing speed by filtering impossible parse trees by using the CFG compiled from a given HPSG-based grammar. However, there is a limit in the speed-up. This is because the compiled CFG grows into...

A log-linear model with an n-gram reference distribution for accurate HPSG parsing

This paper describes a log-linear model with an n-gram reference distribution for accurate probabilistic HPSG parsing. In the model, the n-gram reference distribution is simply defined as the product of the probabilities of selecting lexical entries, which are provided by the discriminative method with machine learning features of word and POS n-gram as defined in the CCG/HPSG/CDG supertagging....

Efficient HPSG Parsing Algorithm with Array Unification

This paper presents a method for improving parsing performance of parsers for HPSG. The method was obtained by extending Torisawa’s parsing method for HPSG. His parsing method utilizes a CFG compiled from a given HPSG-based grammar, and the parser predicts the possible parse trees with the CFG. Since the amount of unification is reduced because of this prediction, parsing performance is improve...

Efficacy of Beam Thresholding, Unification Filtering and Hybrid Parsing in Probabilistic HPSG Parsing

We investigated the performance efficacy of beam search parsing and deep parsing techniques in probabilistic HPSG parsing using the Penn treebank. We first tested the beam thresholding and iterative parsing developed for PCFG parsing with an HPSG. Next, we tested three techniques originally developed for deep parsing: quick check, large constituent inhibition, and hybrid parsing with a CFG chun...

Publication date: 2007